Abstract

The composition of stable-isotope labelled isotopologues/isotopomers in metabolic 
products can be measured by mass spectrometry and supports the analysis of pathways 
and fluxes. As a prerequisite, the original mass spectra have to be processed, managed 
and stored to rapidly calculate, analyse and compare isotopomer enrichments to study, 
for instance, bacterial metabolism in infection. For such applications, we provide here 
the database application 'Isotopo'. This software package includes (i) a database to store 
and process isotopomer data, (ii) a parser to upload and translate different data formats 
for such data and (iii) an improved application to process and convert signal intensities 
from mass spectra of ¹³C-labelled metabolites such as tert-butyldimethylsilyl derivatives
of amino acids. Relative mass intensities and isotopomer distributions are calculated 
applying a partial least square method with iterative refinement for high precision data. 
The data output includes formats such as graphs for overall enrichments in amino acids.
The package is user-friendly for easy and robust data management of multiple
experiments. 

Availability: The 'Isotopo' software is available at the following web link (section 
Download): http://spp1316.uni-wuerzburg.de/bioinformatics/isotopo/. The package con- 
tains three additional files: software executable setup (installer), one data set file 
(discussed in this article) and one excel file (which can be used to convert data from excel 
to '.iso' format). The 'Isotopo' software is compatible only with the Microsoft Windows 
operating system. 

Database URL: http://spp1316.uni-wuerzburg.de/bioinformatics/isotopo/. 



Introduction 

Incorporation experiments using simple stable-isotope 
labelled precursors can result in unique isotopologue pat- 
terns of metabolic products reflecting the biosynthetic his- 
tory of the metabolite under study. On the basis of 
multiple metabolite analysis, isotopologue patterns can be 
used to reconstruct metabolic pathways and interactions 
even under complex experimental conditions (1). For sim- 
ple settings, such as bacteria grown under chemostat con- 
ditions with a single carbon source, it is possible to derive 
accurate metabolic flux distributions from such data (2-4). 
In practice, the computational analysis of such isotopo- 
logue data is still a challenge (5-8). Extending previous ef- 
forts in mass isotopomer distribution analysis [MIDA; 
(1-8)], we first improved our own algorithm and software 
interface for mass spectrometry (MS) data (9) by including 
a partial least square calculation. An important challenge 
in isotopologue profiling is that metabolite data have to be 
collected on a large-scale to provide the required data basis 
for the interpretation of the underlying metabolic net- 
works. Therefore, we have created a database manager 
that enables rapid conversion and storage of multiple iso- 
topomer data. This includes a specific parser that allows 
data exchange between different formats as well as for 
direct reading of different data formats into 'Isotopo'. 
Moreover, the output formats have been extended, 
enabling data comparison on the basis of graphs displaying
the ¹³C-profiles in compound families such as amino
acids.

As a result, the database application 'Isotopo' has been 
established including (i) a large-scale database to store and 
process multiple isotopologue data, (ii) a useful parser to 
upload and translate different data formats for such data 
and (iii) an improved application to directly process MS 
spectra of ¹³C-labelled compounds such as tert-butyldimethylsilyl
(TBDMS)-derivatives of amino acids. Relative
mass intensities are herein calculated applying a partial 
least square method and, for optimal resolution of 
the isotopologue data, iterative refinement is effected. 



The freely available package includes a tutorial and allows 
robust data management of multiple labelling experiments. 

Motivation and results 

Different data sets were tested (metabolites from different 
bacterial strains and media). To apply Isotopo, experimental 
data have first to be collected from gas chromatography- 
mass spectrometry (GC-MS) experiments. The experimental 
process using Isotopo consists of three major steps: data 
preparation, analysis and visualization as shown in Figure 1, 
following the implemented workflow as shown in Figure 2a. 

Example results and graphics are shown in Figure 2b.
Results are represented as mountain plots and bar plots,
which display spectra based on the absolute enrichment
values and abundances against mass-to-charge ratios. The
example is based on the computational analysis of seven
different TBDMS-derivatives of amino acids (alanine-260;
the fragment number is 260, adding one molecule alanine,
two molecules TMS (trimethylsilyl) and subtracting one
methyl group; the fragments are thus numbered according
to their total molecular weight, fragment charge z = 1;
accordingly, fragment numbers are given for glycine-246,
lysine-300, aspartic acid-418, threonine-404, proline-184
and tyrosine-302; format: amino acid-fragment number)
from labelling experiments with Salmonella (details in 
Supplementary Material). Based on the obtained results, a 
clear difference is observed in actual (experimental data) 
and calculated (estimated) natural and relative abundance 
values. The database applications were successfully used 
in various experiments on isotopologue data to study the 
metabolism of bacteria under different growth conditions. 

Figure 1. Components of Isotopo. Schematic presentation of the workflow between the components of Isotopo (Microsoft Excel data file,
Data Reader, Database, Data Manager and Data Analyser).

Figure 2. Workflow of Isotopo and presentation of results. (A) Unified Modeling Language (UML)-based flow chart of the
operations performed during experimental data input, processing, analysis and visualization. The workflow starts
with the input of raw data, which are first validated and then analysed to estimate the natural abundance values (using abundance matrix and
binomial expansion), relative intensity values and relative abundance values (using Brauman's least square algorithm). Finally, the analysed results are
visualized as a bar plot, stored in the internal storage and available to export. (B) The bar plot is based on the estimated absolute enrichment values
of various amino acids from a labelling experiment with Salmonella typhimurium: alanine-260, glycine-246, lysine-300, aspartic acid-418,
threonine-404, proline-184 and tyrosine-302, whereas the mountain plot gives in the example a detailed view based on the calculated natural and relative
abundance values of alanine-260. Bar and mountain plots are different options to visualize different data sets.

Before we started our database application effort, data
were scattered in different formats, no database was available
and results were calculated by in-house software using
Excel macros or resorting to more limited academic or commercial
solutions. For instance, the program Envelope is a
visualization software package for already calculated isotope
distributions. With a user-friendly interface, the displayed
isotope distributions change in near real time in response to
user-controlled changes in the labelling parameters using 
continuously variable slider controls or text input boxes 
(10). The Theoretical Isotope Generator is an application de- 
veloped to gain isotopologue-related information on mol- 
ecules and their relative intensity for educational purposes 
(chemists, students, lecturers and researchers) (11). For com- 
parison, METATOOL, one of a number of related tools for 
modelling purposes, allows calculation of fluxes in various 
metabolic networks relying on biochemical reactions, exter- 
nal and internal metabolites but requiring and relying on no 
further experimental data (12). The software Isodyn (13) 
simulates the dynamics of isotopic isomer distribution in 
central metabolic pathways, and is further enhanced by algo- 
rithms facilitating the transition between various analysed 
metabolic schemes as well as tools for model discrimination. 
With the software least square mass isotopomers distribu- 
tion analysis (LS-MIDA) (9), we studied an implementation 
of a MIDA calculation software; however, the refined 
Isotopo database application presented here provides, together
with the database and its management, an improved
MIDA calculation algorithm (rectangular matrix, partial 
least square calculation) with iterative refinement for auto- 
matic, direct and everyday isotopologue data analysis and 
management. The Isotopo database application now delivers 
full control over the data, comparison between data sets as 
well as data management. All these options are important 
enhancements for long-term and accurate usage in large- 
scale isotopologue analyses and experiments. 

By these and similar experiments on Salmonella and 
Listeria, a detailed analysis of amino acid metabolism and 
its connection to carbohydrate metabolism in infections 
was possible (14, 15). In general, insights provided by the 
new software on bacterial metabolism (e.g. Staphylococcus 
aureus) include better understanding of metabolic changes 
and adaptation, alternative processing routes and energy 
and yield considerations. 

Specifically, we wanted to investigate how Salmonella ob- 
tain nutrients and metabolites in the Salmonella-containing 
vacuole. Alternative hypotheses envision a direct transport 
and redirection of nutrients from the Salmonella cytoplasm 
into the vacuole or stress the autotrophic potential of the Sal- 
monella, so that only a basic C- or N-source is required. To 
get an insight into this, numerous data sets had to be 
assembled applying our application. Current data indicate, for 
a number of amino acids (e.g. glutamate, aspartate), that rapid 
and direct synthesis is possible. Additionally, a clear indication 
for transport processes (e.g. glucose transporters) could be ob- 
tained. The database application has now been made open for 
the community, so that of course completely different biolo- 
gical questions and experiments can be efficiently stored and 
analysed through exploiting isotopologue data. 



Methodology 

Isotopo is developed using the Microsoft C# programming
language and the Microsoft .NET Framework, which is why it
is compatible only with Microsoft Windows operating systems.
Isotopo and all its provided material are available for free
usage, and by downloading and using it, users agree to these
license conditions.

Data analyser 

This is the most important (backbone) module of the appli- 
cation, which provides options for the experimental data 
load, analysis and visualization (Figure 3). It allows the 
user to enter experimental data manually or to load data 
from existing files and to connect the application to the 
database server (if available) to fetch data. We have intro- 
duced new data standards for efficient data extraction and 
effective data management, which speed up preprocessed 
data processing operations. We have standardized data in
two different formats: '*.iso' (the extension for normal
isotope files, which can be used in both the data management
and data analyser modules) and '*.isx' (the
extension for data generated by the data reader after parsing
existing data lists, e.g. from Excel files). The major reason
for this new standardization of experimental raw data
is to categorize the data in personalized formats to speed up
data processing. The '*.iso' and '*.isx' formats
consist of the following elements, separated by semicolons (';'):
metabolite name, mass-to-charge ratio values,
first, second and third set of relative intensity values (from
three independent measurements), standard relative intensity
values, atom mass, fragment, fixed value and date
and time. Multiple metabolite values are separated by an
asterisk ('*').
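As an illustration of the record layout described above, the following Python sketch parses such data. It is not the Isotopo implementation (which is written in C#). Several points are assumptions made only for this example: that 'date and time' form a single element (giving ten elements per record), that the numbers inside one element are separated by '#', and that the asterisk separates the records of different metabolites. The field names and the sample line are invented.

```python
from dataclasses import dataclass
from typing import List

FIELD_SEP = ";"   # stated in the text: elements are separated by ';'
RECORD_SEP = "*"  # assumption: '*' separates the records of metabolites
VALUE_SEP = "#"   # assumption: separator for the numbers inside one element


@dataclass
class IsoRecord:
    metabolite: str
    mz: List[float]
    intensities: List[List[float]]  # three independent measurements
    standard: List[float]
    atom_mass: str
    fragment: str
    fixed_value: str
    timestamp: str


def _numbers(field: str) -> List[float]:
    """Split one element into floats, e.g. '260#261' -> [260.0, 261.0]."""
    return [float(v) for v in field.split(VALUE_SEP) if v.strip()]


def parse_iso(text: str) -> List[IsoRecord]:
    """Parse '*'-separated records of ten ';'-separated elements each."""
    records = []
    for chunk in text.split(RECORD_SEP):
        if not chunk.strip():
            continue
        parts = [p.strip() for p in chunk.strip().split(FIELD_SEP)]
        if len(parts) != 10:
            raise ValueError(f"expected 10 elements, got {len(parts)}")
        name, mz, r1, r2, r3, std, atom_mass, fragment, fixed, stamp = parts
        records.append(IsoRecord(
            metabolite=name,
            mz=_numbers(mz),
            intensities=[_numbers(r1), _numbers(r2), _numbers(r3)],
            standard=_numbers(std),
            atom_mass=atom_mass, fragment=fragment,
            fixed_value=fixed, timestamp=stamp,
        ))
    return records


# Invented sample data, for illustration only.
sample = ("Ala-260;260#261#262;100#20#5;99#21#5;101#19#6;100#22#9;"
          "3;2;260;06.11.2013 08:40")
records = parse_iso(sample + "*" + sample.replace("Ala-260", "Gly-246"))
```

Keeping the three separators as named constants makes it easy to adapt the sketch once the exact delimiters of a real '*.iso' file have been checked.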

The data analyser now processes the experimental data, 
individual data elements (one single metabolite informa- 
tion, for example, for the alanine-260 fragment) and data 
sets with multiple data entries at once (sets of different me- 
tabolites, for example, glycine-246, lysine-300, aspartic 
acid-418). Input data are based on the following informa- 
tion: metabolite name, mass-to-charge ratio values, num- 
ber of fragments, mass values, number of atoms and actual 
and standard intensity values. After the analysis,
it returns the following results: minimum to maximum
mass values of a selected fragment (M-1, M0, Mmax), mean
relative intensity values of mass signals, natural abundance
values and relative abundance values, which indicate here
and throughout the manuscript relative intensities of isotopomers
in labelled compounds. Furthermore, it gives absolute
enrichments of estimated natural and relative
abundance values, so that overall abundances of isotopomers
in natural abundance compounds are compared with
mean values and standard deviation of experimentally
measured (three times) abundance values.
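The summary statistics mentioned above can be sketched in a few lines of Python. The sketch assumes that the three measurements of each mass signal are summarized by their mean and sample standard deviation, and that the overall enrichment of a fragment with n labelled positions is the abundance-weighted label content, sum(i * x_i) / n. This is a commonly used definition, not a formula quoted from the Isotopo source, and all numbers are invented.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def replicate_stats(runs: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (mean, sample standard deviation) per mass signal across runs."""
    return [(mean(signal), stdev(signal)) for signal in zip(*runs)]


def overall_enrichment(fractions: Sequence[float], n_atoms: int) -> float:
    """Abundance-weighted label content in percent.

    fractions[i] is the relative abundance of the isotopologue carrying
    i labelled atoms (M+i); the values are normalised by their sum.
    """
    total = sum(fractions)
    if total <= 0 or n_atoms <= 0:
        raise ValueError("need positive abundances and atom count")
    weighted = sum(i * x for i, x in enumerate(fractions))
    return 100.0 * weighted / (total * n_atoms)


# Invented relative abundances (M+0 .. M+3) from three measurements.
runs = [
    [0.70, 0.10, 0.10, 0.10],
    [0.72, 0.08, 0.10, 0.10],
    [0.68, 0.12, 0.10, 0.10],
]
stats = replicate_stats(runs)
mean_fractions = [m for m, _ in stats]
enrichment = overall_enrichment(mean_fractions, n_atoms=3)
```

With these invented values, the mean abundances are 0.70, 0.10, 0.10 and 0.10, which corresponds to an overall enrichment of about 20%.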

Resulting values are presented in six different subsections
of the data analyser, including the following: the fragment
viewer provides the information about the number of groups
with each group's respective natural and relative abundance values
(the latter according to the experimental data; together,
they allow an estimate of the absolute enrichment); the
spectrum viewer visualizes the produced results in different
modes (bar, curve, filled curve and labels); and the
results viewer provides all resultant data for all sets of experiments.
This helps, especially when the user needs to analyse
a complete data set consisting of multiple data entries.
Furthermore, entered, loaded, edited or analysed results data 
can be saved in data files and further reused as required. 

The detailed methodological, mathematical and fea- 
ture-based information, along with produced results (using 
GC-MS preprocessed experimental data), is given in the 
Supplementary Material. 

To meet the aforementioned application's goals, the 
graphical user interface (GUI) of Isotopo data analyser 
consists of nine main controls (open data file, clear all text
controls, measure selected data, process all data, remove
selected data, open data manager, close Isotopo, select 
values and results) and seven views (Isotopo Analyser, 
Fragment Viewer, Spectrum Viewer, Result Viewer, Relative 
Abundance 1, Relative Abundance 2 and Relative 
Abundance 3), as explained in the attached Supplementary 
Material. 

Database manager 

One major focus during the development of a new sys- 
tem towards MIDA is to create a system with fewer 
dependencies, especially in terms of data management, 
analysis, visualization and sharing. During the software 
planning, we found that the major issues are not related to 
the archiving of data (e.g. which database to use), but to 
the handling of data. In our case, the experimental (GC- 
MS) raw data typically come in Microsoft Excel sheets, 
separated into the different formats, which later need to be 
analysed. Furthermore, the aim was to have a system with 
a personalized data management module, which can easily 
work offline as well. Keeping these requirements in mind, 
we designed and implemented the new system, which provides
a third-party-independent personalized file-based
data management system. It allows the user to create, edit,
load (previously created), delete and update data files with
defined extension ('*.iso').

Figure 4. Isotopo database manager GUI. The top right part of the GUI allows the user to load an existing data file. The top left part of the GUI allows
the user to enter new, edit selected and delete values shown in the bottom left list view. The bottom right part allows the user to connect to the
database, load and delete data from database.

One cannot deny the importance of available data man- 
agement systems (e.g. MySQL, Oracle, PostgreSQL) to 
archive, access and secure huge amounts of data. Based on 
our previous experience, we preferred here a MySQL data- 
base solution as it provided Isotopo with a flexible and per- 
formant solution. To further extend the scope of data 
management and feature-based development, we have also 
designed a new database scheme (given in the attached 
Supplementary Material, Section 7). The data may be kept 
in an in-house, intranet database, and when desired 
the system is set up such that only authorized users can connect
to the created Isotopo database server (IsoDB) to get
and set up the data. It allows the user to perform direct 
data manipulation and management operations using 
the created database, which can later be directly accessed 
by the data analyser module for further processing. 
However, both options — connection to the Internet (web 
server option) and direct local storage of the data — are 
provided, which allow more flexibility than one option 
alone. 



To meet the aforementioned application's goals, the 
GUI of Isotopo Data Manager (Figure 4) consists of 16 
main controls (open data file, clear all text controls, close 
Isotopo data manager, add new values, update edited val- 
ues, clear text fields, save data in file, select values to edit, 
delete values, create new data file, select source directory, 
save file, cancel creating file, data view, open Isotopo
Data Analyser and open Isotopo Data Viewer), as explained
in the attached Supplementary Material.

Data format parser 

This utility module of the Isotopo application (Figure 5) 
helps the user in transferring data from a different data 
format (e.g. Microsoft Excel) to the '*.isx' data format.
This module is based on a newly programmed, dedicated
data classifier based on supervised machine learning 
principles. 

The programed data classifier automatically merges all 
the data from three sheets into a simple and readable format 
(Figure 3). Preprocessed GC-MS data from Excel data files 
are rapidly parsed and integrated into the database. The 
data are divided into three parts: Supplementary Table S1
contains the information about metabolites, mass-to-charge
ratio values with actual relative intensity values.
Supplementary Table S2 contains the information about
similar metabolites and mass-to-charge ratio values including
standard relative intensity values. Supplementary Table
S3 contains the essential information about the number of
fragments, atoms, mass values and groups. An example
Microsoft Excel data file is provided in Supplementary
Material ('*.xlsx' are used in 64-bit operating systems with
Microsoft Office version greater than 2003, and '*.xls' are
used in 32-bit operating systems with Microsoft Office version
2003 or older).

Database, Vol. 2014, Article ID bau077

Figure 5. Isotopo data format parser GUI. This is the main interface of the module, which presents the converted data from Excel data files.
Moreover, it provides options at the top right to open, refresh or load data into the database, export data into a file, delete data and close the
application. The top left options are to browse the data analyser, database manager and data viewer.

Our developed data classifier intelligently reads all 
tables and transfers the data into our software (see 
Supplementary Material for detailed examples). 

The user can not only read and validate the data using 
Excel data files but also edit them using the Isotopo data 
format parser. The user can also manually convert the data 
into the '*.isx' format. Later, the user can export converted
data into the data files ('*.isx') as well as directly load
them to the connected database server, which can be easily 
accessed by the database manager for further editing, etc., 
and data analyser for further analysis [just follow the for- 
mat of data given in the provided Excel sheet (attached in 
Supplementary Material)]. 



Raw data processing by the Isotopo software 

Different calculation algorithms for MIDA have already 
been given (4-6, 16-18). MIDA has been quantitatively 
validated and compared by independent methods, includ- 
ing a biosynthesis polymer measurement (1, 2, 7, 8). A bi- 
nomial expansion is used for the measurement of natural 
abundance values. An abundance matrix is drawn and 
multiple regression analysis is performed. As a specific algorithmic
improvement to previous software including our
own (9), we introduce an improved partial least square calculation
and algorithm for the measurement of relative intensity
values with respect to each m/z value. Using the
estimated relative intensity values, a new abundance matrix
is drawn and its pseudo-inverse matrix is calculated;
from these, we estimate actual values and percentages of relative
abundances, fractional molar abundances and minimum
values with respect to the number of fragments (see
Supplementary Material for details). This whole procedure
is repeated twice to obtain precise values. A third (optional)
iteration validates results and convergence. Using the
resultant natural, relative and fractional molar abundances,
absolute ¹³C enrichments, means and standard deviations
are calculated.
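The two numerical ingredients named above, a binomial expansion for natural abundances and a least-squares fit against an abundance matrix, can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not the Isotopo algorithm: it considers carbon only (natural ¹³C abundance taken as approximately 1.07%), ignores the heteroatoms of the derivatisation group, and solves the normal equations directly (equivalent to applying the pseudo-inverse when the matrix has full column rank) instead of using the partial least square procedure with iterative refinement. All function names and numbers are hypothetical.

```python
from math import comb
from typing import List

P13C = 0.0107  # approximate natural abundance of 13C (assumption)


def binomial_pattern(n_atoms: int, p: float = P13C) -> List[float]:
    """Probability of finding k heavy atoms among n_atoms, for k = 0..n."""
    return [comb(n_atoms, k) * p**k * (1 - p)**(n_atoms - k)
            for k in range(n_atoms + 1)]


def abundance_matrix(n_atoms: int, n_signals: int) -> List[List[float]]:
    """Column j = expected spectrum of a species with j labelled carbons.

    The remaining n_atoms - j carbons follow the natural binomial pattern,
    shifted by j mass units. Rows are the mass signals M+0 .. M+n_signals-1,
    so the matrix is rectangular when n_signals > n_atoms + 1.
    """
    matrix = [[0.0] * (n_atoms + 1) for _ in range(n_signals)]
    for j in range(n_atoms + 1):
        for k, prob in enumerate(binomial_pattern(n_atoms - j)):
            if j + k < n_signals:
                matrix[j + k][j] = prob
    return matrix


def least_squares(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve min ||a x - b|| via the normal equations (a^T a) x = a^T b."""
    rows, cols = len(a), len(a[0])
    # Augmented normal-equation system [a^T a | a^T b].
    m = [[sum(a[r][i] * a[r][j] for r in range(rows)) for j in range(cols)]
         + [sum(a[r][i] * b[r] for r in range(rows))] for i in range(cols)]
    for col in range(cols):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, cols), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(cols):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][cols] / m[i][i] for i in range(cols)]


# Synthetic check: mix known fractions, then recover them from the spectrum.
true_fractions = [0.6, 0.1, 0.1, 0.2]           # M+0 .. M+3, invented
A = abundance_matrix(n_atoms=3, n_signals=5)     # 5 x 4, rectangular
spectrum = [sum(A[r][j] * true_fractions[j] for j in range(4))
            for r in range(5)]
estimated = least_squares(A, spectrum)
```

Because the synthetic spectrum is noise-free, the fit recovers the invented fractions up to rounding error; with measured spectra the residual of the fit would indicate how well the model matches the data.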

The reliability in the results produced by this technique 
and the Isotopo application depends on a number of
factors. First, the analysis is based on the assumption
that the fragmentation patterns for all heteroatom isotopes 
are identical (i.e. no differential isotope effect), the relative 
abundance (actual) values of the isotopes are known and 
either the natural abundances are known or measured in 
some way. For optimal results, pre-filtering by the vendor 
software in the instrumentation to identify peaks, filtering 
out solvent contamination and enhancing the signal to- 
gether with the new Isotopo software for further process- 
ing and calculations, is critical, as tested in various 
metabolic pathway analyses using different pathogens. 
A tutorial in the Supplementary Material shows that 
Isotopo is easily used and handled. 

Isotopo is capable of analysing three actual relative in- 
tensity values together with the use of standard relative in- 
tensity values against mass-to-charge ratio values, and, in 
return, not only estimates the abundances for each used ac- 
tual intensity value but also the averages of relative and 
natural abundance. Importantly, besides the improved al- 
gorithm for isotopologue value calculation, Isotopo is a 
database application with the inclusion of the new 
database management system, data format parser and inte- 
grated product line architecture, whereas in all aforemen- 
tioned tools there is no database management system and 
utility to automatically convert the data from other for- 
mats to the processable data format. 

Conclusion 

Isotopologue distributions from MS measurements are 
rapidly translated into a quantitative pathway analysis 
applying the open-source and well-designed database ap- 
plication 'Isotopo'. The software facilitates the usage of 
MS-based labelling experiments for a broad range of po- 
tential users interested in the metabolism of bacteria, host 
cells and other organisms. 

Supplementary Data 

Supplementary Data are available at Database Online. 